Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We?
When evolving their software, organizations and individual developers must
spend substantial effort to pay back technical debt, i.e., the consequence of
releasing software in a shape not as good as it should be, e.g., in terms of
functionality, reliability, or maintainability. This paper empirically
investigates the extent to which technical debt can be automatically paid back
by neural-based generative models, and in particular models exploiting
different strategies for pre-training and fine-tuning. We start by extracting a
dataset of 5,039 Self-Admitted Technical Debt (SATD) removals from 595
open-source projects. SATD refers to technical debt instances documented (e.g.,
via code comments) by developers. We use this dataset to experiment with seven
different generative deep learning (DL) model configurations. Specifically, we
compare transformers pre-trained and fine-tuned with different combinations of
training objectives, including the fixing of generic code changes, SATD
removals, and SATD-comment prompt tuning. We also investigate the
applicability, in this context, of a recently released Large Language Model
(LLM)-based chatbot. Results of our study indicate that the automated
repayment of SATD is a challenging task, with the best model we experimented
with able to automatically fix ~2% to 8% of test instances, depending on the
number of attempts it is allowed to make. Given the limited size of the
fine-tuning dataset (~5k instances), the model's pre-training plays a
fundamental role in boosting performance. Also, the ability to remove SATD
steadily drops if the comment documenting the SATD is not provided as input to
the model. Finally, we found that general-purpose LLMs are not a competitive
approach for addressing SATD.
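To make the task concrete: a SATD-removal instance pairs a developer-written debt comment with the code it annotates, and a sequence-to-sequence model generates the fixed code. The sketch below shows one plausible input encoding; the tag format, helper name, and example are invented here for illustration and are not the paper's actual representation.

```python
# Hypothetical framing of a SATD-removal instance as a seq2seq example:
# the SATD comment plus the affected code form the model input, and the
# fixed code (with the comment removed) would be the training target.

def build_satd_input(satd_comment: str, code: str) -> str:
    """Pair the SATD comment with the code to form the model input."""
    return f"<comment> {satd_comment} </comment> <code> {code} </code>"

example = build_satd_input(
    "TODO: this lookup is O(n), replace with a dict",
    "for u in users:\n    if u.name == name:\n        return u",
)
print(example)
```

This also mirrors the abstract's finding that performance drops without the comment: removing the `<comment>` segment deprives the model of the only explicit signal describing what debt to repay.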
Toward Automatically Completing GitHub Workflows
Continuous integration and delivery (CI/CD) are nowadays at the core of
software development. Their benefits come at the cost of setting up and
maintaining the CI/CD pipeline, which requires knowledge and skills often
orthogonal to those entailed in other software-related tasks. While several
recommender systems have been proposed to support developers across a variety
of tasks, little automated support is available when it comes to setting up and
maintaining CI/CD pipelines. We present GH-WCOM (GitHub Workflow COMpletion), a
Transformer-based approach supporting developers in writing a specific type of
CI/CD pipelines, namely GitHub workflows. To deal with such a task, we designed
an abstraction process to help the learning of the transformer while still
making GH-WCOM able to recommend very peculiar workflow elements such as tool
options and scripting elements. Our empirical study shows that GH-WCOM provides
up to 34.23% correct predictions, and the model's confidence is a reliable
proxy for the recommendations' correctness likelihood.
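To give a flavor of what such an abstraction process might look like (GH-WCOM's actual schema is more elaborate and is not reproduced here), the sketch below replaces volatile literals in a workflow line with typed placeholder tokens before the text reaches the model, so that the transformer learns structure rather than memorizing version strings:

```python
import re

# Illustrative abstraction pass over GitHub workflow lines: mask version
# numbers and commit SHAs with typed placeholders. This is a guess at the
# general idea, not GH-WCOM's real abstraction rules.

def abstract_workflow_line(line: str) -> str:
    line = re.sub(r"\bv?\d+\.\d+(\.\d+)?\b", "<VERSION>", line)  # semantic versions
    line = re.sub(r"\b[0-9a-f]{40}\b", "<SHA>", line)            # pinned commit SHAs
    return line

print(abstract_workflow_line("uses: actions/checkout@v4.1.1"))
# → uses: actions/checkout@<VERSION>
```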
Automatically Generating Dockerfiles via Deep Learning: Challenges and Promises
Containerization allows developers to define the execution environment in
which their software needs to be installed. Docker is the leading platform in
this field, and developers that use it are required to write a Dockerfile for
their software. Writing Dockerfiles is far from trivial, especially when the
system has unusual requirements for its execution environment. Despite several
tools exist to support developers in writing Dockerfiles, none of them is able
to generate entire Dockerfiles from scratch given a high-level specification of
the requirements of the execution environment. In this paper, we present a
study in which we aim at understanding to what extent Deep Learning (DL), which
has been proven successful for other coding tasks, can be used for this
specific coding task. As a preliminary step, we defined a structured natural language
specification for Dockerfile requirements and a methodology that we use to
automatically infer the requirements from the largest dataset of Dockerfiles
currently available. We used the obtained dataset, with 670,982 instances, to
train and test a Text-to-Text Transfer Transformer (T5) model, following the
current state-of-the-art procedure for coding tasks, to automatically generate
Dockerfiles from the structured specifications. The results of our evaluation
show that T5 performs similarly to the more trivial IR-based baselines we
considered. We also report the open challenges associated with the application
of deep learning in the context of Dockerfile generation.
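To make the comparison concrete, an IR-based baseline of the kind mentioned above can be sketched as nearest-neighbor retrieval over specification text. The corpus, the specification format, and the choice of Jaccard similarity below are illustrative assumptions, not the study's exact setup:

```python
# Minimal IR baseline sketch for Dockerfile generation: given a structured
# specification, return the Dockerfile whose stored spec has the highest
# token overlap (Jaccard similarity). Corpus and spec syntax are invented.

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

corpus = {
    "base: python 3.10; install: flask": "FROM python:3.10\nRUN pip install flask",
    "base: node 18; install: express": "FROM node:18\nRUN npm install express",
}

def retrieve(spec: str) -> str:
    """Return the stored spec closest to the query spec."""
    return max(corpus, key=lambda k: jaccard(spec, k))

query = "base: python 3.10; install: requests"
print(corpus[retrieve(query)].splitlines()[0])  # → FROM python:3.10
```

That a trained T5 performs on par with retrieval of this kind is the core negative result the abstract reports.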
Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks
Deep learning (DL) techniques are gaining more and more attention in the
software engineering community. They have been used to support several
code-related tasks, such as automatic bug fixing and code comments generation.
Recent studies in the Natural Language Processing (NLP) field have shown that
the Text-To-Text Transfer Transformer (T5) architecture can achieve
state-of-the-art performance for a variety of NLP tasks. The basic idea behind
T5 is to first pre-train a model on a large and generic dataset using a
self-supervised task (e.g., filling masked words in sentences). Once the model
is pre-trained, it is fine-tuned on smaller and specialized datasets, each one
related to a specific task (e.g., language translation, sentence
classification). In this paper, we empirically investigate how the T5 model
performs when pre-trained and fine-tuned to support code-related tasks. We
pre-train a T5 model on a dataset composed of natural language English text and
source code. Then, we fine-tune such a model by reusing datasets used in four
previous works that used DL techniques to: (i) fix bugs, (ii) inject code
mutants, (iii) generate assert statements, and (iv) generate code comments. We
compared the performance of this single model with the results reported in the
four original papers proposing DL-based solutions for those four tasks. We show
that our T5 model, exploiting additional data for the self-supervised
pre-training phase, can achieve performance improvements over the four
baselines. Accepted to the 43rd International Conference on Software Engineering
(ICSE 2021).
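The self-supervised pre-training task mentioned above (filling masked spans) can be sketched at the word level as T5-style span corruption. The `<extra_id_n>` sentinel naming follows the T5 convention; the whitespace tokenization and the example are simplified for illustration:

```python
# Word-level sketch of T5 span corruption: masked spans are replaced with
# sentinel tokens in the input, and the target lists each sentinel followed
# by the tokens it hides. Real T5 operates on subword tokens.

def span_corrupt(tokens, spans):
    """spans: list of (start, length) token spans to mask, in order."""
    inp, tgt = [], []
    pos = 0
    for sid, (start, length) in enumerate(spans):
        inp += tokens[pos:start] + [f"<extra_id_{sid}>"]
        tgt += [f"<extra_id_{sid}>"] + tokens[start:start + length]
        pos = start + length
    inp += tokens[pos:]
    return " ".join(inp), " ".join(tgt)

toks = "public int sum ( int a , int b ) { return a + b ; }".split()
inp, tgt = span_corrupt(toks, [(2, 1), (12, 3)])
print(inp)  # → public int <extra_id_0> ( int a , int b ) { return <extra_id_1> ; }
print(tgt)  # → <extra_id_0> sum <extra_id_1> a + b
```

Pre-training on such corrupted code and English text, then fine-tuning on the four task-specific datasets, is the two-phase recipe the abstract describes.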
Polymorphism: an evaluation of the potential risk to the quality of drug products from the Farmácia Popular Rede Própria
Polymorphism in solids is a common phenomenon in drugs, which can lead to compromised quality due to changes in their physicochemical properties, particularly solubility, and can therefore reduce bioavailability. Herein, a bibliographic survey was performed based on key issues and studies related to polymorphism in the active pharmaceutical ingredients (APIs) present in medications from the Farmácia Popular Rede Própria. Polymorphism must be controlled to prevent possible ineffective therapy and/or improper dosage. Few mandatory tests for the identification and control of polymorphism in medications are currently available, which can result in serious public health concerns.